ABSTRACT

This paper proposes a human identification system based on features extracted from electrocardiogram (ECG)
signals. A two-level hierarchical classification structure based on a global shape feature and a local statistical
feature is used to identify subjects from their ECG signals. The global shape feature represents the outline
information of ECG signals, and the local statistical feature captures the information between signal samples in
the time domain. A genetic algorithm based back propagation neural network is used as the classifier.
Experimental results show that our identification system achieves an average accuracy of 97.6% on 38 subjects
of the PTB public ECG database and an average accuracy of 100% on 18 subjects of the MIT-BIH public ECG
database, which demonstrates that the proposed system can reach satisfactory performance.

1. INTRODUCTION

Automatic human identification using physiological modalities has been widely researched because of its
significance in many security areas. Many biometric modalities have been studied for human identification,
such as facial features (Samaria and Harter, 1994; Nagamine et al., 1992), gait (Kale et al., 2003), fingerprint
(Hodges and Pollack, 2007), and iris (Zhu et al., 2000). However, these biometric modalities either cannot
provide reliable performance in terms of identification accuracy or are not robust enough against falsification.
The electrocardiogram (ECG) measures and records the electrical potentials of the heart, which are considered
to be unique to each person. The main reason to use ECG signals to identify individuals is the physiological
and geometrical differences between hearts (Hoekema et al., 2001).

Recently, ECG signals have been widely studied for human identification. To build an efficient ECG human
identification system, the most important element is the set of distinctive features extracted from ECG
signals. Several methods have been proposed for ECG feature extraction. Kyoso and Uchiyama (2001) present a
system which identifies subjects by comparing a person’s ECG with previously registered ECG feature
parameters. These feature parameters are sampled from the intervals and durations of the
electrocardiographic wave, extracted using characteristic points appearing on the waveform of the
second-order derivative, and are identified using discriminant analysis. Wang et al. (2013) proposed human
identification from ECG signals based on a sparse representation of local segments, which are extracted from
an ECG signal and projected onto a small number of basic elements in a dictionary. Biel et al. (2001) extracted
attributes that are temporal and amplitude distances between detected fiducial points. They proposed two
extraction methods, called the analytic-based method and the appearance-based method.

In this paper, two different features of the ECG signal are extracted: a global shape feature and a local
statistical feature. Because these two kinds of features represent different information, a two-level
hierarchical classification structure has been designed, mainly inspired by the idea of reducing a
large-class-number problem to a small-class-number problem. In the comparison phase, a genetic algorithm
based back propagation neural network (GA-BPNN) combined classifier is used.

The rest of this paper is organized as follows. Related work is presented in Section 2. Section 3 introduces
the preprocessing of ECG signals and the two feature extraction algorithms. The two-level hierarchical
classification structure is explained in Section 4. We give the experimental results in Section 5. Finally, the
paper ends with concluding remarks in Section 6.


2. RELATED WORKS 

Automatic and accurate human identification systems have become increasingly important in several aspects
of daily life, such as access control, financial transactions, electronic commerce, and others. Traditional
strategies for identification (e.g., “password”, “IDs”) are no longer adequate to satisfy modern
requirements. Compared to traditional methods, biometric features are more reliable and secure in verifying
individuals. The biometric features used in human identification systems fall into two main categories:
physiological and behavioral. Physiological biometric features commonly include face, fingerprints,
retina, and iris. Behavioral biometric features include signature and voice.

An ECG signal describes the electrical activity of the heart over time and can be recorded noninvasively
using electrodes attached to the surface of the body (Silipo et al., 1996). Advantages of using the ECG for
biometric recognition include universality, permanence and uniqueness, robustness to attack, aliveness
detection, and data minimization (Agrafioti et al., 2011). Previous work on feature vectors measured from
different parts of ECG signals for classification can be summarized as either fiducial point dependent or
independent. Fiducial point dependent methods depend on local characteristics of the heartbeat, such as time
durations or amplitude differences between fiducial points. The non-fiducial point approaches extract features
statistically based on the overall morphology of the waveform (Agrafioti et al., 2011).

Biel et al. (2001) used SIEMENS equipment to transfer and convert the ECG to a usable format. A
feature selection algorithm based on the correlation matrix is employed to reduce the dimension of the
features. The method used to classify persons is SIMCA (Soft Independent Modeling of Class Analogy). The
SIMCA model finds similarities between test objects and classes rather than identical behavior. The
experiment tested 20 persons, and a 100% identification rate was achieved by using empirically selected
features. The lack of automatic identification is the major drawback.

Saechia et al. (2005) proposed a human identification system using the Fourier transform of the ECG signal
as the feature extraction tool. Once the ECG signals are normalized to the same heart rate, each signal is
divided into three subsequences corresponding to the P, QRS, and T waves, respectively. From the resulting
Fourier coefficients, only significant elements are selected and fed to a neural network for classification. On
the database used, their experimental results show that the proposed system can identify 31 of 35
individuals.

Wang et al. (2008) present a systematic analysis of human identification from ECG data. A
fiducial-detection-based framework that incorporates analytic and appearance attributes is introduced.
Existing solutions for ECG signal recognition are based on temporal and amplitude distances between
detected fiducial points. Such methods rely heavily on the accuracy of fiducial detection. To completely
relax the detection of fiducial points, a new approach based on autocorrelation in conjunction with the
discrete cosine transform is proposed. Two public ECG databases (PTB and MIT-BIH) are used. Experimental
results show that the proposed framework can achieve 100% subject/individual identification.


3. PROPOSED METHOD 

Human identification is essentially a pattern recognition problem which involves signal preprocessing,
feature extraction, and classification. The ECG signal is an important biometric attribute, and it can be used
for human identification because different individuals have physiologically and geometrically different
hearts, which gives their ECG signals a certain uniqueness.

ECG signals are recordings of the electrical activity of the heart. A heartbeat can be roughly divided into
phases of depolarization and repolarization (Biel et al., 2001). The depolarization phases correspond to the
P-wave and QRS-wave. The repolarization phases correspond to the T-wave. A basic ECG signal cycle is shown
in Figure 1.

Figure 1. Basic ECG Signal Cycle


3.1 Preprocessing 

The raw ECG signals usually contain low- and high-frequency noise components (Israel et al., 2005). The
low-frequency noise is expressed as the slope of the overall signal across multiple heartbeat traces. The
high-frequency noise is expressed as the intra-beat noise. Israel et al. (2005) point out that three
fundamental frequencies can be identified: the 60 Hz electrical noise due to the power line, the 1.10 Hz
heartbeat information, and the 0.06 Hz change in baseline electrical potential. The remainder of the frequency
content is a combination of other noise sources and subject information. The goal of filtering is to remove the
0.06 Hz and 60 Hz noise while retaining the individual heartbeat information between 1.10 Hz and 40 Hz. In
this system, a Butterworth band-pass filter is selected to perform noise filtering. The cutoff frequencies of
the filter are 1 Hz and 40 Hz, based on empirical results.
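
The band-pass step can be sketched with SciPy as follows. This is a minimal illustration rather than the authors' implementation: only the Butterworth design and the 1 Hz to 40 Hz pass band come from the text, while the filter order (4), the zero-phase application with `sosfiltfilt`, and the function name `bandpass_ecg` are assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass_ecg(signal, fs, low_hz=1.0, high_hz=40.0, order=4):
    """Band-pass filter a 1-D ECG trace sampled at fs Hz.

    The order and the forward-backward (zero-phase) filtering are
    assumptions; the text only fixes the Butterworth type and the band.
    """
    # Second-order sections are numerically safer than (b, a) coefficients
    # when the lower cutoff is very small relative to the sampling rate.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, np.asarray(signal, dtype=float))
```

With a 1000 Hz recording such as PTB, a call would look like `filtered = bandpass_ecg(raw, fs=1000)`; the slow baseline drift and the 60 Hz component are attenuated while content between the two cutoffs is kept.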

Noise filtering is applied to the ECG signal to minimize the negative effects of noise. Figure 2 gives a
graphic illustration of the applied preprocessing procedure.


Figure 2. ECG signal preprocessing (panel label: before filtering; horizontal axis: time)


3.2 Global Shape Feature 

The ECG is a non-periodic but highly repetitive signal. For the same person, the shape of the R-R intervals
of the ECG signal is nearly the same, whereas for different people the shapes of the R-R intervals differ.
Global shape features are extracted based on this attribute. After preprocessing of an ECG signal, R points
are found in the signal, and then ten R-R intervals are cut from the ECG signal and averaged into one
interval.

Assume one person has $n$ normal R-R intervals denoted as $S = \{S_1, S_2, S_3, \ldots, S_n\}$. The average
length of these $n$ normal R-R intervals is calculated and denoted as $\mu = \sum_{i=1}^{n} |S_i| / n$, where
$|S_i|$ is the length of $S_i$. If $|S_i| > \mu$, this R-R interval should be compressed, and the position of
the deleted point is $\lfloor |S_i| / (|S_i| - \mu) + 1 \rfloor$; if $|S_i| < \mu$, this R-R interval should be
filled with the mean value of two points, one right before the position
$\lfloor |S_i| / (\mu - |S_i|) + 1 \rfloor$ and one right after it. After this step, the lengths of all $n$
R-R intervals are equal to $\mu$. We get the global shape feature $G$ by
$G_i = \frac{1}{n} \sum_{j=1}^{n} S_{j,i}$, where $S_{j,i}$ is the $i$-th sample of the $j$-th
length-normalized interval and $i = 1, 2, 3, \ldots, \mu$. Figure 3 shows the basic diagram of global shape
feature extraction.

Figure 3. Diagram of global shape feature extraction
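
The length normalization and averaging of R-R intervals can be illustrated with the short sketch below. It is not the authors' code. It assumes the R-R intervals have already been cut out (R-peak detection is omitted), rounds the mean length to an integer, and spreads deletions and midpoint insertions evenly over each interval, which follows the spirit of the position rule in the text without reproducing it exactly. The function names are hypothetical.

```python
def compress(beat, target):
    """Drop evenly spaced samples so that len(result) == target."""
    excess = len(beat) - target
    step = len(beat) / excess
    drop = {int(k * step + step / 2) for k in range(excess)}
    return [x for i, x in enumerate(beat) if i not in drop]


def stretch(beat, target):
    """Insert midpoints of neighbouring samples until len(result) == target.

    Assumes the interval has at least two samples.
    """
    beat = list(beat)
    while len(beat) < target:
        gaps = len(beat) - 1
        count = min(target - len(beat), gaps)
        step = gaps / count
        fill = {int(k * step + step / 2) for k in range(count)}
        out = []
        for i, x in enumerate(beat):
            out.append(x)
            if i in fill:
                # Mean of the sample before and the sample after the gap.
                out.append((x + beat[i + 1]) / 2)
        beat = out
    return beat


def resize(beat, target):
    if len(beat) > target:
        return compress(beat, target)
    if len(beat) < target:
        return stretch(beat, target)
    return list(beat)


def global_shape_feature(beats):
    """Average length-normalized R-R intervals sample by sample."""
    mu = round(sum(len(b) for b in beats) / len(beats))
    resized = [resize(b, mu) for b in beats]
    return [sum(column) / len(resized) for column in zip(*resized)]
```

The returned list has length equal to the rounded mean interval length, and each entry is the mean of the corresponding samples across the normalized intervals.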


3.3 Local Statistical Feature 

Local statistical features are extracted based on the statistical counting and ranking of binary patterns
converted from ECG signal samples. According to Fufu and Tseng (2012), the advantages of a statistics-based
algorithm are: there is no need for QRS detection while running the algorithm, and the result may still be
robust to dynamic variation of ECG signals; variations in the length and the sampling rate of the matching
signals are allowed; and the algorithm runs rapidly with low computational complexity.

Consider the ECG signal as $S = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ corresponds to the $i$-th input
sample. An interval-distance-set between two samples $x_i$ and $x_{i+p}$ is denoted as
$I = \{l_1, l_2, \ldots, l_t\}$, where every $l$ in $I$ is an integer and represents a distance. According to
the interval-distance-set, compare each pair of consecutive input samples and categorise the data into one of
two cases: a decrease or an increase in $x_i$. A preliminary reduction function then maps these two cases to 0
or 1, respectively, according to (1):

$$y_i = \begin{cases} 0, & x_{i+p} \le x_i, \quad 1 \le i < n \\ 1, & x_{i+p} > x_i, \quad 1 \le p \le l \end{cases} \qquad (1)$$

Equation (1) converts the ECG signal of length $n$ to a binary sequence $Y = \{y_1, y_2, \ldots, y_{n-1}\}$ of
length $n - 1$. Group every $m$ consecutive elements of $Y$ into a rank order binary sequence of length $m$,
referred to as an $m$-bit word; collect all such words to form a rank order binary pattern
$B = \{b_1, b_2, \ldots, b_k, \ldots, b_{n-m}\}$, where $b_k = \{y_k, y_{k+1}, \ldots, y_{k+m-1}\}$. We then
convert each $m$-bit word $b_k$ to its decimal expansion $w_k$. Next, count the occurrences of all $w_k$ and
sort them in order of descending frequency. For $k = 1, 2, \ldots, n - m$, define $j = w_k$. The values of $j$
range from 0 to $2^m - 1$. Let $p(j)$ be the corresponding relative frequency of $j$,
$p(j) = n_j / (n - m)$, where $n_j$ is the number of occurrences of $j$ and
$\sum_{j=0}^{2^m - 1} n_j = n - m$; $p(j)$ is the local statistical feature.
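
As a rough illustration of the procedure, the sketch below converts a sample sequence into the binary sequence, slides an m-bit window over it, and returns the relative frequency of every word value from 0 to 2^m - 1. It handles a single interval distance `p`; how features from several distances in the interval-distance-set are combined is not specified in the text, so that step is left out. Treating equal samples as a 0 and all function names are assumptions of this example.

```python
from collections import Counter


def binary_sequence(x, p=1):
    """Map each pair (x[i], x[i + p]) to 1 for an increase, else 0."""
    return [1 if x[i + p] > x[i] else 0 for i in range(len(x) - p)]


def word_values(y, m):
    """Decimal value of every m-bit word obtained by sliding over y."""
    words = []
    for k in range(len(y) - m + 1):
        value = 0
        for bit in y[k:k + m]:
            value = value * 2 + bit
        words.append(value)
    return words


def local_statistical_feature(x, m, p=1):
    """Relative frequency p(j) of each word value j in 0 .. 2**m - 1."""
    words = word_values(binary_sequence(x, p), m)
    counts = Counter(words)
    return [counts.get(j, 0) / len(words) for j in range(2 ** m)]
```

For a signal of length n and p = 1, the binary sequence has n - 1 elements and there are n - m words, so the returned frequencies sum to 1, matching the normalization in the text.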


4. HIERARCHICAL CLASSIFICATION STRUCTURE

Global shape features and local statistical features are two complementary representations of the
characteristics of ECG signals. An efficient integration of these two kinds of features can enhance the
identification performance.

4.1 Back Propagation Neural Network 

Back propagation, an abbreviation for "backward propagation of errors", is a common method of training 
artificial neural networks. The BPNN (Back Propagation Neural Network) algorithm learns the weights for a 
multilayer network, given a network with a fixed set of units and interconnections. It employs gradient 
descent to attempt to minimize the squared error between the network output values and the target values for 
those outputs. 

Each training example is a pair of the form $\langle x, t \rangle$, where $x$ is the vector of network input
values and $t$ is the vector of target network output values; $\eta$ is the learning rate (e.g., 0.05). We
denote by $n_{in}$ the number of network inputs, $n_{hidden}$ the number of units in the hidden layer, and
$n_{out}$ the number of output units. The input from unit $i$ into unit $j$ is denoted $x_{ji}$, and the weight
from unit $i$ to unit $j$ is denoted $w_{ji}$. First we create a feed-forward network with $n_{in}$ inputs,
$n_{hidden}$ hidden units, and $n_{out}$ output units, and initialize all network weights to small random
numbers. For each $\langle x, t \rangle$ in the training examples, we propagate the input forward through the
network: input the instance $x$ to the network and compute the output $o_u$ of every unit $u$ in the network.
Like a perceptron, the sigmoid unit first computes a linear combination of its inputs and then applies a
threshold to the result. In the case of the sigmoid unit, however, the threshold output is a continuous
function of its input. More precisely, the sigmoid unit computes its output $o$ as $o = \sigma(w \cdot x)$,
where $\sigma(y) = 1/(1 + e^{-y})$. Then we propagate the errors backward through the network. For each network
output unit $k$, calculate its error term $\delta_k = o_k (1 - o_k)(t_k - o_k)$. For each hidden unit $h$,
calculate its error term $\delta_h = o_h (1 - o_h) \sum_{k \in outputs} w_{kh} \delta_k$. Then update each
network weight $w_{ji} = w_{ji} + \Delta w_{ji}$, where
$\Delta w_{ji}(n) = \eta \delta_j x_{ji} + \alpha \Delta w_{ji}(n - 1)$ and $0 \le \alpha < 1$ is the momentum
constant. The second term is called momentum, a common addition to the weight-update rule.
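
The forward pass, the two error terms, and the momentum update can be put together in a small pure-Python sketch. It is meant only to make the update rule concrete: the single hidden layer, the absence of bias terms, the weight range, and the class name `TinyBPNN` are assumptions of this example, not details given in the paper.

```python
import math
import random


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


class TinyBPNN:
    """One-hidden-layer sigmoid network trained with momentum (no biases)."""

    def __init__(self, n_in, n_hidden, n_out, eta=0.05, alpha=0.9, seed=0):
        rng = random.Random(seed)
        # w_hidden[h][i] is the weight from input i to hidden unit h;
        # w_out[k][h] is the weight from hidden unit h to output unit k.
        self.w_hidden = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.05, 0.05) for _ in range(n_hidden)]
                      for _ in range(n_out)]
        self.dw_hidden = [[0.0] * n_in for _ in range(n_hidden)]
        self.dw_out = [[0.0] * n_hidden for _ in range(n_out)]
        self.eta, self.alpha = eta, alpha

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w_hidden]
        out = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
               for row in self.w_out]
        return hidden, out

    def train_step(self, x, t):
        hidden, out = self.forward(x)
        # Error terms, computed before any weight is changed.
        delta_out = [o * (1 - o) * (tk - o) for o, tk in zip(out, t)]
        delta_hidden = [
            h * (1 - h) * sum(self.w_out[k][j] * delta_out[k]
                              for k in range(len(out)))
            for j, h in enumerate(hidden)
        ]
        # Weight change = eta * delta * input + alpha * previous change.
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                self.dw_out[k][j] = (self.eta * delta_out[k] * hidden[j]
                                     + self.alpha * self.dw_out[k][j])
                row[j] += self.dw_out[k][j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                self.dw_hidden[j][i] = (self.eta * delta_hidden[j] * x[i]
                                        + self.alpha * self.dw_hidden[j][i])
                row[i] += self.dw_hidden[j][i]
        return out
```

Calling `train_step` repeatedly on the training pairs performs stochastic gradient descent on the squared error; the stored previous changes `dw_out` and `dw_hidden` carry the momentum from one step to the next.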

4.2 Genetic Algorithm Based on Back Propagation Neural Network (GA-BPNN)

In recent years, genetic algorithms that use an artificial neural network model as the objective or fitness
function have been applied successfully to optimizing the input space in various bioprocess studies (Zhang
et al., 2007). The genetic algorithm is an artificial intelligence based stochastic non-linear optimization
technique which solves optimization problems based on natural selection, the process that drives biological
evolution. A genetic algorithm is capable of finding both the weights and the architecture of a neural
network, including the number of layers, the processing elements per layer, and the connectivity between
processing elements.
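
The paper does not give the genetic algorithm configuration, so the following is only a generic sketch of how a real-valued chromosome, for example a flattened vector of network weights, could be evolved against a fitness function such as the negative training error. Tournament selection, one-point crossover, Gaussian mutation, elitism, and every parameter value here are assumptions chosen for illustration.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Maximize fitness(chromosome) over real-valued chromosomes.

    Returns the best chromosome and the best fitness seen per generation.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: the best individual survives unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied gene by gene.
            child = [g + rng.gauss(0.0, 0.1) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the best individual is always copied into the next generation, the recorded best fitness never decreases. In a GA-BPNN setting the chromosome found this way could serve as the starting weights for back propagation, although the paper does not state which combination it uses.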

4.3 Hierarchical Classification Structure 

To better utilize the complementary characteristics of the global shape feature and the local statistical
feature, a two-level hierarchical classification structure has been adopted, mainly inspired by the idea of
reducing a large-class-number problem to a small-class-number problem. In pattern recognition, when the
number of classes is large, the boundaries between different classes tend to be complex and hard to separate.
It is easier if we can reduce the possible number of classes and perform classification in a smaller scope
(Wang et al., 2008). Using a hierarchical architecture, we can first classify the input into a few potential
classes, and a second-level classification can then be performed within these candidates.

Figure 4 is the basic chart of the two-level hierarchical architecture. In the first step, the global shape
feature is used for classification with GA-BPNN. During this step, most unrelated subjects are filtered out.
If all the test samples are classified as one subject, then the first GA-BPNN classifier outputs this result
directly. Otherwise, a second GA-BPNN classification using the local statistical features selects among the
remaining subjects.

This classification structure maps global classification into local classification and reduces the complexity
and difficulty. Such a hierarchical architecture can be applied to other pattern recognition problems as well.
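
The two-stage decision logic can be sketched independently of the underlying classifiers. In the example below, `stage1_predict` and `stage2_scores` are hypothetical callables standing in for the two trained GA-BPNN classifiers, and the majority vote over test samples in the second stage is an assumption; the text only states that the second classifier works within the candidates left by the first.

```python
from collections import Counter


def hierarchical_identify(global_feats, local_feats, stage1_predict, stage2_scores):
    """Identify a subject from several test samples of one recording.

    stage1_predict(g) -> subject label for one global shape feature.
    stage2_scores(l)  -> dict mapping subject label to a score for one
                         local statistical feature (higher is better).
    """
    # Stage 1: every test sample votes with its global shape feature.
    candidates = {stage1_predict(g) for g in global_feats}
    if len(candidates) == 1:
        # All samples agree, so the first classifier decides directly.
        return next(iter(candidates))
    # Stage 2: choose only among the candidates left by stage 1.
    votes = Counter()
    for l in local_feats:
        scores = stage2_scores(l)
        votes[max(candidates, key=lambda s: scores[s])] += 1
    return votes.most_common(1)[0][0]
```

Restricting the second stage to the candidate set is what turns the large-class-number problem into a small-class-number one: subjects ruled out by the global shape feature cannot be selected, even if they would score well on the local statistical feature alone.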



Figure 4. Hierarchical Classification Structure (recoverable diagram labels: Global Shape Feature, GA-BPNN, Results)


5. EXPERIMENT AND RESULT 

To evaluate the performance of our proposed methods, we conducted experiments on two public databases: PTB
(Bousseljot et al., 1995) and MIT-BIH (Goldberger et al., 2000). The PTB database is provided by the National
Metrology Institute of Germany and contains 549 records from 294 subjects. Each record of the PTB database
consists of the conventional 12 leads and 3 Frank leads. The signals were sampled at 1000 Hz with a
resolution of 0.5 μV. The criteria for data selection are healthy ECG waveforms and at least two recordings
for each subject. We randomly select 38 subjects from the total of 294 subjects. The MIT-BIH Normal Sinus
Rhythm Database contains 18 ECG recordings from different subjects. The recordings of the MIT database were
collected at the Arrhythmia Laboratory of Boston’s Beth Israel Hospital. The MIT-BIH Normal Sinus Rhythm
Database was sampled at 128 Hz.

We design our experiment using a nearest neighbor (NN) classifier, GA-BPNN, and the hierarchical
classifier, respectively. Either the global shape feature or the local statistical feature is used for a
single classifier. Combining the two features gives a hierarchical classifier. In Figure 5, G/L-NN stands for
the global shape feature/local statistical feature with the NN classifier; G/L-GABPNN stands for the global
shape feature/local statistical feature with GA-BPNN; and NN+GABPNN stands for the hierarchical structure
with the global shape feature for the NN classifier and the local statistical feature for GA-BPNN.

Experimental results show an identification accuracy of 97% for the 38 subjects of PTB and 100% for the 18
subjects of MIT-BIH. Both databases obtain their best results when the hierarchical classification structure
is used.

Compared to other similar methods, the experimental results show that the proposed method achieves
reliable identification accuracy. The RBP method (Fufu and Tseng, 2012) reaches an identification accuracy of
95.791% at its best. The RBP method is similar to the local statistical feature extraction process; the
difference is that we use a set of intervals other than the interval 1. In (Fufu and Tseng, 2012), a
weighted distance formula (2) is defined to measure the similarity of two ECG signals:

$$D_m(S_1, S_2) = \frac{\sum_{k=0}^{2^m - 1} |R_1(w_k) - R_2(w_k)| \, p_1(w_k) \, p_2(w_k)}{(2^m - 1) \sum_{k=0}^{2^m - 1} p_1(w_k) \, p_2(w_k)} \qquad (2)$$

where $p_i(w_k)$ and $R_i(w_k)$ represent the probability and ranking of $w_k$ in the sequence $S_i$, $i = 1$
or $2$. The absolute difference between two rankings is multiplied by the normalized probabilities as a
weighted sum; the factor $2^m - 1$ in the denominator ensures that all values of $D_m$ lie between 0 and 1.
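
A small sketch of such a rank-weighted distance is given below. It assumes the normalized form in which the sum of rank differences weighted by the product of the two word probabilities is divided by 2^m - 1 times the sum of those products; this reading of formula (2), the tie-breaking by word index, and the function names are assumptions of the example. The inputs are two relative-frequency vectors of length 2^m.

```python
def ranks(p):
    """Rank word values by descending probability (rank 1 = most frequent).

    Ties are broken by word index, which is an arbitrary choice.
    """
    order = sorted(range(len(p)), key=lambda k: -p[k])
    result = [0] * len(p)
    for position, k in enumerate(order, start=1):
        result[k] = position
    return result


def weighted_rank_distance(p1, p2):
    """Rank-order distance between two word-frequency vectors."""
    if len(p1) != len(p2):
        raise ValueError("frequency vectors must have the same length")
    r1, r2 = ranks(p1), ranks(p2)
    weights = [a * b for a, b in zip(p1, p2)]
    total = sum(weights)
    if total == 0:
        raise ValueError("distance is undefined when no word occurs in both signals")
    weighted = sum(abs(a - b) * w for a, b, w in zip(r1, r2, weights))
    # len(p1) equals 2**m, so len(p1) - 1 is the largest possible rank gap.
    return weighted / ((len(p1) - 1) * total)
```

Identical frequency vectors give a distance of 0, and because a rank difference can never exceed 2^m - 1, the result stays between 0 and 1.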

The AC/DCT method (Wang et al., 2008) is a similar hierarchical classification structure using an LDA
classifier and a nearest neighbor classifier. Wang et al. (2008) proposed a feature extraction method without
fiducial detection based on a combination of autocorrelation and the discrete cosine transform. The AC/DCT
method involves four stages: (1) windowing, where the preprocessed ECG trace is segmented into
non-overlapping windows, with the only restriction that the window has to be longer than the average
heartbeat length so that multiple pulses are included; (2) estimation of the normalized autocorrelation of each
window; (3) discrete cosine transform over L lags of the autocorrelation signal; and (4) classification based
on significant coefficients of the DCT. The AC/DCT method offers 94.47% and 97.8% window recognition rates
for the PTB and MIT-BIH datasets, respectively. The comparison is shown in Figure 6.

Figure 5. Comparison of experiment results

Figure 6. Comparison with other methods


6. CONCLUSION 

This paper proposes a human identification system using the global shape feature and the local statistical
feature of ECG signals. The global shape features are extracted based on the non-periodic but highly
repetitive character of ECG signals; differences in the shape of ECG signals between individuals do exist.
The local statistical features take advantage of local differences among samples within one signal. To
better utilize the complementary characteristics of the local statistical features and the global shape
features, a two-level hierarchical classification structure has been adopted, mainly inspired by the idea of
reducing a large-class-number problem to a small-class-number problem. Experimental results show that the two
combined GA-BPNN classifiers achieve better identification accuracy for both the PTB and MIT-BIH databases.
The idea of combining a global feature with a local feature and using hierarchical classification can be
applied to identification systems using other biometric features.